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STRONG PSEUDOPRIMES TO TWELVE PRIME BASES 


JONATHAN SORENSON AND JONATHAN WEBSTER 

Abstract. Let xjjm be the smallest strong pseudoprime to the first m prime 
bases. This value is known for 1 < m < 11. We extend this by finding ' 0 i 2 
and 'ipis- We also present an algorithm to find all integers n < B that are 
strong pseudoprimes to the first m prime bases; with a reasonable heuristic 
assumption we can show that it takes at most 52 / 3 + 0 ( 1 ) time. 


1. Introduction 

Fermat’s Little Theorem states that if n is prime and gcd(a,n) = 1, then 

= 1 (mod n). 

A composite number for which this congruence holds is called a pseudoprime to 
the base a. We can restate Fermat’s Little Theorem by algebraically factoring 
(repeatedly) the difference of squares that arises in — 1. In which case, if n is 
an odd prime, writing n — 1 = 2®d with d odd, and gcd(a, n) = 1, then either 

a'^ = 1 (mod n) 
or 

= —1 (mod n) 

for some integer k with 0 < fc < s. If a composite n satisfies the above, we call 
n a strong pseudoprime to the base a. Unlike Carmichael numbers, which are 
pseudoprimes to all bases, there do not exist composite numbers that are strong 
pseudoprimes to all bases. This fact forms the basis of the Miller-Rabin probabilistic 
primality test m- Define tpm to be the smallest integer that is a strong pseudoprime 
to the first m prime bases. (See A014233 at oeis.org.) 

The problem of finding strong pseudoprimes has a long history. Pomerance, Sel¬ 
fridge, and Wagstaff [12] computed ipm for m = 2,3, and 4. Jaeschke |7] computed 
'0m for TO = 5, 6, 7, and 8. By looking at a narrow class of numbers, Zhang [18] gave 
upper bounds on '0^ for 9 < to < 19. Recently, Jian and Deng [5] verified some of 
Zhang’s conjectures and computed 0m for to = 9,10, and 11. We continue in this 
effort with the following. 
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Theorem 1.1. 


= 3186 65857 83403 11511 67461. 
ipi3 = 33170 44064 67988 73859 61981. 

We also verified Jian and Deng’s work in finding ijjg = V'lo = V’li- Note that the 
ERH implies ipm > exp(A/TO/2), if tjjm exists [T]. 

The proof of our theorem relies primarily on running an improved algorithm for 
finding strong pseudoprimes: 

Theorem 1.2. Given a bound B > 0 and an integer m > 0 that grows with B, 
there is an algorithm to find all integers < B that are products of exactly two primes 
and are strong pseudoprimes to the first m prime bases using at most 
arithmetic operations. 

We have a heuristic argument extending this 52 / 3 + 0 ( 1 ) j-^j^ning time bound to 
integers with an arbitrary number of prime divisors. This result is an improvement 
over a heuristic 0(5®/^^) from [5]. Our algorithm uses the work of Jian and Deng 
as a starting point, and our specific improvements are these: 

• We applied the results in Bleichenbacher’s thesis [2] to speed the computa¬ 
tion of GCDs. See j ]2.2| 

• We applied the space-saving sieve mia which enabled us to sieve using 
full prime signature information. See §2.3[ 

• We used a simple table of linked-lists indexed by a hash function based on 
prime signatures. We used this to store small primes by signature for very 
fast lookup. See §2.4| 

• When it made sense to do so, we parallelized our code and made use of Big 
Dawg, Butler University’s cluster supercomputer. 

Our changes imply that adding more prime bases for a pseudoprime search (that 
is, making m larger) speeds up the algorithm. 

The rest of this paper is organized as follows. Section 2 contains a description 
of the algorithm and its theoretical underpinnings. Section 3 contains the proof of 
the running time. Section 4 contains implementation details. 

2. Algorithmic Theory and Overview 

To find all n < 5 that are strong pseudoprimes to the first m prime bases, we use 
some well-known conditions to limit how many such integers n we need to check. 
This is outlined in the first subsection below. 

We construct all possibilities by generating n in factored form, n = pip 2 ■ ■ .pt-iPt, 
where t is the number of prime divisors of n, and pi < Pi+i- It turns out that t 
cannot be too big; we only had to test up to t = 6 . So we wrote separate programs 
for each possible t and the case t = 2 dominates the running time. 

Let k = PiP 2 ’ '' Pt-i- Our algorithms generate all candidate k values, and 
then for each k search for possible primes pt to match with k, to form a strong 
pseudoprime candidate n = kpt- 

For the bound 5, we choose a cutoff value X. For k < X, we use a GCD 
computation to search for possible primes pt, if they exist. For k > X, we use 
sieving to find possible primes pt- Again, we use separate programs for GCDs and 
for sieving. 
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Once each n is formed, we perform a strong pseudoprime test for the first m 
prime bases to see if we have, in fact, found what we are looking for. 

2.1. Conditions on (Strong) Pseudoprimes. For the purposes of this paper, 
it suffices to check square-free odd integers. A computation by Dorais and Klyve 
[3] shows that a pseudoprime to the bases 2 and 3 must be square-free if it is less 
than 4.4 • 10^^. Since the values Zhang conjectured as candidates for '!/)i 2 and ' 0 i 3 
are less than 4.4 • 10^^, we restrict our search to square-free numbers. 

In general, to rule out integers divisible by squares, it suffices to find all Wieferich 
primes < , and for each such prime, perform appropriate tests. This can easily 

be done in well under time. See §3.6.1 in [2]. 

To each prime p we associate a vector called its signature. Let u = (oi, 02 ,..., am) 
for Ui distinct positive integers. Then the signature of p is 

o’p = (i^(ordp(ai)),u(ordp(a 2 )),...,p(ordp(am))), 

where v denotes valuation with respect to 2 and ordp(a) is the multiplicative order 
of a modulo p. 

Example 2.1. Let p be the prime 151121. Then = (3,4,0,4, 2,1, 2,4) if u = 
(2,3,5,7,11,13,17,19). 

Theorem 2.2 (Proposition 1 in [7]). Let n = pi---pt with t distinct primes, 
V = ( 01 , 02 ,... ,Om) with different integers such that pi j aj for all 1 < i < t and 
1 < j < m. Then n is a strong pseudoprime to each base in v if and only if n is a 
pseudoprime to each base in v and ap_^ = • • • = . 

If t > 2, then this theorem greatly limits the number of candidate k values we 
need to check. In the case t = 2 the initial sub-product k is just a single prime, 
so this theorem does not, at first, appear to help. However, it does play a role in 
sieving for pt. For the rest of this paper, we use v as the vector containing the first 
m = 11 or 12 prime numbers. 

2.2. Greatest Common Divisors. As above, let fc = pi • • -pt-i- We explain in 
this subsection how to find all possible choices for pt to form n = kpt as a strong 
pseudoprime to the first m prime bases using a simple GCD computation. 

Let b be one of the oi from v. The pseudoprimality condition b^P*~^ = 1 
(mod kpt) implies that 

i.e., that pt divides b^~^ — 1. Considering multiple bases, we know that pt must 
divide 


gcd (4-i-l,a^i-l,...,at-'-l). 

However, since we are concerned with strong pseudoprimes, we can consider only 
the relevant algebraic factor of b^~^ — 1. Following Bleichenbacher’s notation [2], 
let fc — 1 = u2'^ for u odd and 


r 5“ — 1 if u(ordfc(6)) = 

\ 1 ^(ordfc(6)) = c > 0 
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Theorem 2.3 (Theorem 3.23 of [2])- A necessary condition for an integer n of the 
from n = kpt, for a given k with pt prime, to he a strong pseudprime to the bases 
is that Pt divide gcd{h{ai,k),... ,h{am,k)). 

This theorem allows us to construct the product of all possible pt values for a 
given k by computing a GCD. To carry this out, we figure out which of the h{ai, k) 
values would be the smallest, compute that, and then compute the next smallest 
h value modulo the smallest, perform a GCD, and repeat until we get an answer 
< fc, or use up all the at base values. 


Let fc = 151121, 

which is prime. 

We have 

x(ordfc(2)) = 3 

^ 237^®° + 1 ! 

« 8.189 X 10113^2 

x(ordfc(3)) = 4 

^75560 _j_ 2 , 

« 1.914 X 1036O51 

x(ordfc(5)) = 0 

^9445 2 , 

« 5.911 X 10®3°^ 


In this case, we would begin the computations with — 1 since it is the smallest. 

We then compute gcd(2^^^®° + 1 mod — 1) = 151121. This tells 

us that any n of the form 151121 ■ pt is not a strong pseudoprime to the vector 
1 /= (2,3,5). 


These computations start off easy as k is small and progressively get harder (in 
an essentially linear fashion) as k grows. As a practical matter, these computations 
only need to be done once. As an example, [5] found 

84 98355 74122 37221 = 206135341 • 412270681 


to be a strong pseudoprime to the first eight prime bases. When considering pi = 
206135341, they sieved (which we explain in the next section) for p 2 = 412270681. 
It is reasonable to ask whether there is a larger prime that can be paired with pi 
that would form a strong pseudoprime to eight bases. In our computation we check 
by computing a GCD to answer “no.” 


Algorithm 1: Using GCDs to rule out k values 

Input: Integers t and k, where t is the number of prime divisors and k is the prod¬ 
uct of t — 1 primes with matching signature, and the base vector v containing 
the m smallest primes. 

1 : Let b £ u give the smallest estimated value for h{b, k) as defined above. 

2 : If 6 = oi, let i = 2 else let i = 1. 

3: Compute h{b,k). 

4: Compute X = h{ai, k) mod h{b, k) using modular exponentiation. 

5; Set X = gcd(li( 6 , fc), a;); 

6 : while X > k and i < m do 

7: Set i = i + 1; A at = b set i = i + 1 again. 

8 : Compute y = h{ai, fc) mod x. 

9: Set X = gcd(x, y). 

10: end while 

11: if X < fc then 

12 : We can rule out fc. 

13: else 

14: We factor x and check each prime pt \ x with pt > k to see if kpt is a strong 

pseudoprime. 
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15: end if 


Computing h{b,k) is the bottleneck of this algorithm. When using m = 11 or 
higher, we never found anything to factor in the last step. 

2.3. Sieving. As above, let k = pi ■ ■ -pt-i and we try to construct n = kpt by 
finding pt- In order to search up to a bound B, we need to search primes pt in 
the interval [pt-i, B/k], We combine two types of sieving to search this interval 
quickly. The first we call A-sieving, and the second we call signature sieving. For 
each prime p define 

Xp = 1cm {ordp(a) : a G u} . 

By the r-rank Artin Conjecture m, we expect with very high probability that 
Xp = p — 1. Let A = 1cm { Ap^,..., Apj_j }. Since = 1 (mod n) for any a G u 

this means kpt — 1 is a multiple of A. So, pt = k~^ (mod A). This tells us that 
Pt lies in a specific residue class modulo A, and we call searching for pt using this 
value X-sieving. 

We can further narrow down the residue classes that pt may lie in with respect 
to primes in v. The following two propositions appeared originally in [7]. 

Proposition 2.5. For primes p and q, if v{p — 1) = v{q — 1) and n(ordp(a)) = 


(ord,(a)) then (^) = (^) 


V 


Proposition 2.6. For primes p and q, if v{p — 1) < v{q — 1) and n(ordp(a)) = 



One may consider higher reciprocity laws, and Jaeschke did so [7]. We found 
quadratic reciprocity sufficient. 

The authors of [5] consider searching for pt modulo lcm{A, 9240}. They consid¬ 
ered 30 residue classes modulo 9240 = 8 • 3 • 5 • 7 • 11 that arose from the propositions 
above. With the inclusion of each additional prime from u, we effectively rule out 
half of the cases to consider, thereby doubling the speed of the sieving process. 

Ideally, we would like to include every prime in v for the best performance, but 
with normal sieving, the sieve modulus becomes quite large and requires we keep 
track of too many residue classes to be practical. Instead, we adapted the space¬ 
saving wheel datastructure [El [15], which had been used successfully to sieve for 
pseudosquares. Sieving for primes pt with specified quadratic character modulo a 
list of small primes is the same algorithmic problem. This wheel sieve uses a data 
structure that takes space proportional to ® instead of space proportional to 
riagy o as used in traditional sieving. It does, however, produce candidate primes 
Pt out of order. This is not an issue, so long as we make sure the sieve modulus 
does not exceed B. In practice, we dynamically include primes from u so that 
1000 • k ■ lcm{A, 8, 3,5, 7,11,13,17,19, 23, 29, 31} < i/'ia, and we favor the inclusion 
of smaller primes. 

Algorithm 2: Sieving 

Input: k in factored form as fc = piP 2 ■ ■ ’Pt-i, Xp- for i < t, search bound B, and 
base vector v of length m. 

1 : Compute Xk = lcm{Ap.}. 

2: Set w = 1. Compute wheel modulus w as follows: 
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3 

4 

5 

6 

7 

8 
9 

10 

11 

12 

13 

14 

15 

16 
17 


18: 

19: 

20 : 

21 : 

22 : 


for z = 2,..., m do 

if kXkttiW < B/IOOO then 

if Qi does not divide Xk then 
Set w = w ■ Ui- 

end if 
end if 
end for 

if Xk divisible by 4 then 
Set A = Xk 

else 


w we have primes 


Set A = Afc/2 and 
Set w = 8w 

end if 

Let a be the signature of pi. 

Build a wheel with modulus w so that for each prime power q 
p generated by the wheel with {p/q) consistent with a as in Propositions 2.5 
and 12.61 

for each residue r modulo w generated by the wheel do 

Use the Chinese Remainder Theorem to compute x mod wX such that x = 
r mod w and x = k~^ mod A. 

Sieve for primes in the interval [k + 1, B/k] that are = x mod wX. 

For each such probable prime pt found, perform a strong pseudoprime test 
on kpt- 

end for 


Note that if k consists entirely of primes that are = 3 mod 4, then all primes pt 
found by the sieve, when using all of v, will have the correct signature. 

Example 2. 7. Consider p = 3 • 10® + 317. Now, A = p - 1 = 2^ • 7 • 11 • 23 • 42349 
(factorization is obtained via sieving). We may use signature information for the 
primes 3, 5, 13, 17, 19, 29, 31, and 37. However, since B = 4 ■ 10^^ we may only 
use the first 4 primes in the signature before the combined modulus is too large. 
The signature information for 3 implies that q = 5,7 (mod 12) however, the case 
q = 7 (mod 12) need not be considered since we know (because of A) that q = I 
(mod 4). Given the signature information for the primes 5 and 13, we search in 
{2,3 (mod 5)} and {2,5,6,7,8,11 (mod 13)}. At this point we sieve for primes 
<7=1 (mod A). To this we add the following 12 new congruence conditions and 
sieve for each individual condition. 

(1) Let q = 2 (mod 5). 

(a) Let q = 2 (mod 13). 

(b) Let q = 5 (mod 13). 

(c) Let <7 = 6 (mod 13). 

(d) Let q=7 (mod 13). 

(e) Let <7 = 8 (mod 13). 

(f) Let < 7=11 (mod 13). 

(2) Let 9 = 3 (mod 5). 

(a) Let < 7=2 (mod 13). 

(b) Let < 7=5 (mod 13). 

(c) Let <7 = 6 (mod 13). 





STRONG PSEUDOPRIMES TO TWELVE PRIME BASES 


7 


(d) Let q=7 (mod 13). 

(e) Let q = 8 (mod 13). 

(f) Let g = 11 (mod 13). 

2.4. Hash Table Datastructure. In order to compute GCDs or sieve as described 
above, it is necessary to construct integers k = PiP 2 ■ • 'Pt-i in an efficient way, 
where all the primes pi have matching signatures. We do this by storing all small 
primes in a hash table datastructure that supports the following operations: 

• Insert the prime p with its signature dp. We assume any prime being 
inserted is larger than all primes already stored in the datastructure. (That 
is, insertions are monotone increasing.) We also store Xp with p and its 
signature for use later, thereby avoiding the need to factor p — 1. 

• Fetch a list of all primes from the datastructure, in the form of a sorted 
array s[], whose signatures match a. The fetch operation brings the Xp 
values along as well. 

We then use this algorithm to generate all candidate k values for a given t and 
v: 

Algorithm 3: Generating k Values 

Input: t > 2, search bound B, cutoff X < B, base vector u of length to. 

1: Let T denote our hash table datastructure 
2 : Let Qm+i denote the smallest prime not in u. 

3: for each prime p < \jB/a*~^-^ (with p — 1 in factored form) do 
4: Compute Xp from the factorization of p — 1; 

5: s[] := r.fetch((Tp); List of primes with matching signature 

6: for 0 < ii < • • • < it -2 < s.length do 

7: Form k = s[7i] • • • s[it- 2 ] ■ P', 

8: if k < X then 

9: Use k in a GCD computation to find matching pt values; 

10: else 

11: Use k for a sieving computation to find matching pt values; 

12: end if 

13: end for 

14: if p < (S/o(;C^i)i/3 then 

15: T.insert(p,crp,Ap); 

16: end if 

17: end for 


Note that the inner for-loop above is coded using t—2 nested for-loops in practice. 
This is partly why we wrote separate programs for each value of t. To make this 
work as a single program that handles multiple t values, one could use recursion on 
t. 

Also note that we split off the GCD and sieving computations into separate 
programs, so that the inner for-loop was coded to make sure we always had either 
k < X or A < k < B/p as appropriate. 

We implemented this datastructure as an array or table of 2™ linked lists, and 
used a hash function on the prime signature to compute which linked list to use. 
The hash function h : ct —?• 0..(2™ — 1) computes its hash value from the signature 
as follows: 
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• If (T = (0,0,0,..., 0), the all-zero vector, then h{a) = 0, and we are done. 

• Otherwise, compute a modified copy of the signature, a'. If cr contains only 
Os and Is, then cr' = cr. If not, find the largest integer entry in cr, and 
entries in cr' are 1 if the corresponding entry in cr matches the maximum, 
and 0 otherwise. 

For example, if cr = (3,4,0,4, 2,1, 2,4) (our example from earlier), then 

a'= (0,1,0,1, 0,0,0,1). 

• h{a) is the value of cr' viewed as an integer in binary. 

So for our example, h{{3, 4,0,4, 2,1, 2,4)) = OIOIOOOI 2 = 64 -|- 16 -|- 1 = 

81. 

Computing a hash value takes 0{m) time. 

Each linked list is maintained in sorted order; since the insertions are monotone, 
we can simply append insertions to the end of the list, and so insertion time is 
dominated by the time to compute the hash value, 0{m) time. Note that the 
signature and Xp is stored with the prime for use by the fetch operation. 

The fetch operation computes the hash value, and then scans the linked list, 
extracting primes (in order) with matching signature. If the signature to match is 
binary (which occurs roughly half the time), we expect about half of the primes 
in the list to match, so constructing the resulting array s[] takes time linear in 
m multiplied by the number of primes fetched. Note that A values are brought 
along as well. Amortized over the full computation, fetch will average time that is 
linear in the size of the data returned. Also note that the average cost to compare 
signatures that don’t match is constant. 

The total space used by the data structure is an array of length 2™, plus one 
linked list node for each prime < . Each linked list node holds the prime, its A 

value, and its signature, for a total of 0(2"* logi? -f bits, since each prime 

is OilogB) bits. 

2.5. Algorithm Outline. Now that all the pieces are in place, we can describe 
the algorithm as a whole. 

Given an input bound B, we compute the GCD/sieve cutoff X (this is discussed 
in below). 

We then compute GGDs as discussed above using candidate k values < X for 
t = 2, 3,.... For some value of t, we’ll discover that there are no candidate k values 
with t — 1 prime factors, at which point we are done with GGD computations. 

Next we sieve as discussed above, using candidate k values > X for t = 2, 3,.... 
Again, for some t value, we’ll discover there are no candidate k values with t — 1 
prime factors, at which point we are done. 

We wrote a total of eight programs; GGD programs for t = 2,3,4 and sieving 
programs for t = 2,3, 4, 5, 6. Also, we did not bother to use the wheel and signature 
sieving, relying only on A sieving, for t > 3. 

3. Algorithm Analysis 

3.1. Model of Computation. We assume a RAM model, with potentially infinite 
memory. Arithmetic operations on integers of 0(logi?) bits (single-precision) and 
basic array indexing and control operations all are unit cost. We assume fast, FFT- 
based algorithms for multiplication and division of large integers. For n-bit inputs, 
multiplication and division cost M{n) = O(nlognloglogn) operations [13]. From 
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this, an FFT-based GCD algorithm takes 0(M(n) log n) = 0(n(logn)^ loglogn) 
operations; see, for example, m 


3.2. Helpful Number Theory Results. Let our modulus q = Jla 

*^( 9 ) ~ ria Gi^s^ch base is an odd prime with 4){ai) = oi — 1, except for 

oi = 8 with (f>{ai) = 4. We use our two propositions from 12.3 together with 


quadratic reciprocity to obtain a list of (a^ — l)/2 residue classes for each odd 
prime and two residue classes modulo 8. The total number of residue classes is 
2“"*0(q). A prime we are searching for with a matching signature must lie in one 
of these residue classes. 

Let us clarify that when we state in sums and elsewhere that a prime p has 
signature equivalent to cr, we mean that the quadratic character (p/ai) is consistent 
with (T as stated in Propositions 2.5 and 2.6 We write a = ap to denote this 


consistency under quadratic character, which is weaker than a = < 7 p. 

Lemma 3.1. Given a signature a of length m, and let q = Yii^i above. Let 
e > 0. If q < then number of primes p < x with ap = a is at most 


O 


2"* log X 


Proof. This follows almost immediately from the Brun-Titchmarsh Theorem. See, 
for example, Iwaniec [5]. We simply sum over the relevant residue classes mod q 
based on quadratic character using Propositions |2.5| and |2.6[ □ 


We will choose m proportional to log B / log log B so that q = B"^ for some 
constant 0 < c < 1. 


Lemma 3.2. We have 

Y. - =0(2-™ log log x). 

p<X,( 7 p = (T ^ 

This follows easily from Lemma |3.1| by partial summation. We could make this 
tighter by making use of Theorem 4 from [3], and observe that the constants C{q, a) 
(in their notation) cancel with one another in a nice fashion when summing over 
different residue classes modulo q. See also section 6 of that paper. 

Let -Kt^uix) denote the number of integers < x with exactly t distinct prime 
divisors, each of which has signature equivalent to a of length m. 


Lemma 3.3. 




a:(log logo;)*’ 

2tm jQg j. 


Proof. Proof is by induction on t. t = 1 follows from Lemma 3.1 


For the general case, we sum over the largest prime divisor p of n, 


■XtAx) = Y ’^t-l.a^xjp) 

y- x(loglog(x/p))*-^ 
^ log(a:/p) 

p<x,ap=<7 

^ a: (log logx)‘~^ 

2tm ]^Qg ^ 
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We used Lemma 3.2 for the last step. 
§22.18].) 


(See also the proof of Theorem 437 in jS] 

□ 


Let Xp^m denote the value of Xp for the prime p using a vector v with the first 
m prime bases. We need the following lemma from [ini Cor. 2.4]: 

Lemma 3.4. There exists a eonstant r > 0 such that 

1 x^/im+l) 

^ ^ ^ ^gx)i+-7(m+ir 

Using this, we obtain the following. 


Theorem 3.5. Let 0 < 9 < 1. Then 


Proof. We have 


E 

<p<x 


1 


pX 


p.m 


1 

^ 2;e-l/(m+l) logj; 


E 

<p<x 


P^p,r, 


1 

- fie 2^ 


x^ <.p<x 


'p,m 


l/(m+l) 


< 


(log 


Note that r > 0. 


□ 


3.3. A Conjecture. In order to prove our i?2/3+o(i) j-^nning time bound for t > 2, 
we make use of the conjecture below. Let JC = JC{t,a) be the set of integers with 
exactly t distinct prime divisors each of which has signature matching a. In other 
words, TTt^a-ix) counts integers in K bounded by x. 


Conjecture 3.6. Let B,a,t,m,i/ be as above. Then 


k<X,k^K(t.C!) 

Theorem 3.7. Assuming Conjecture\3. 6\ is true, we have 


E 


P^k.m 


Proof. The proof is essentially the same as that of Theorem [3.5 
Conjecture |3.6| for Lemma [T4| 


Simply substitute 
□ 


3.4. Proof of Theorem |1.2[ We now have the pieces we need to prove our running 
time bound. 

As above, each pseudoprime candidate we construct will have the form n = kpt, 
where k is the product of t — 1 distinct primes with matching signature. Again, 
v = (2,3, 5,...) is our base vector of small primes of length m. 
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3.4.1. t = 2. In this case k is prime. 

GCD Computation. For each prime k = p < X we perform a GCD computa¬ 
tion as described in §2.2| The bottleneck of this computation is computing GCDs 
of 0(p)-bit integers. This gives a running time proportional to 

'^M{p)\ogp < 7r(X)X(logX)^ loglogX 

p<X 

<C X'^ log X log log X. 


Sieving Computation. For each prime k = p with X < p < B/p^ we sieve 
the interval [p-\- 1, B/p] for primes that are = 1 mod Ap. We also employ signature 
sieving, but for the simplicity of analysis, we omit that for now. Using the methods 
from |15j , we can sieve an arithmetic progression of length I m.O{i log B) operations. 
We do not need proof of primality here, so a fast probabilistic test works fine. This 
gives a running time proportional to 

E E 

X<p<B/p ^ P X<p<B/X^ P 


At this point we know < B and, to keep the GG D and sieving computations 
balanced, X > say. This means Theorem |3.5| applies; we set = X and 

X = BjX to obtain 


U log U B log U 

^ pAp 


X<p<BIX 


(5/X)l/(™+l) 

Alog(B/A) 


Ji. 


assuming that to —>■ oo with B. 

Minimizing implies X = and gives the desired 

running time of This completes our proof. □. 


3.4.2. t > 2. In this case k is composite. 

GCD Computation. For t > 2 we construct integers k = rp for computing 
GGDs, with r consisting of exactly t — 2 prime divisors less than p, with signatures 
matching Up. For each prime p, we perform a hash table lookup and fetch the list of 
such primes; this is Step 5 of Algorithm 3. This overhead cost is 0(TO)-|-0(7ro.p(p)) = 
0{m + 2“'"p/ logp). Summing this over all primes p < X gives 0(2“™A^/ log A). 

Next, for each prime p < A we construct at most 7rt_2,o-p (A/p) values for r (Step 
7 of Algorithm 3), and using Lemma 3.3 and multiplying by the cost of computing 
the GCD, this gives us 


(3.1) 


p<X 


,(A/p)-M(A)logA<A- 


/ log log A A 

I 2™ y 


t-2 


log A 


for the total running time. 

Sieving Computation. Again, the main loop enumerates all choices for the 
second-largest prime p = pt-i- First, we construct all possible A:-values with fc > A, 
k < B/p, and p | fc, so fc/p = r = pi • • ■pt -2 with all the pi < Pi+i and Pt -2 < P- 
We also have pt -2 < B^/^. This implies p > 
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For a given p, fetching the list of primes below u := min{p, B/p^} with matching 
signatures takes 0(m + u/(2’" logu)) time (Algorithm 3 Step 5). Summing over all 
p < y/B, splitting the sum at p = B^^^, this takes time. 

As above, we write r = k/p. We claim that the total number of k values is 
2-m(i-2)^2/3+o(i)^ ^ above. There are at most 1 ^ 1 - 2 ,ap{u) values of r for 


each p. Again, splitting this at 

we have 

E T^t-2,ap{p) 

Xi/{t-i)<p<_Bi/3 

< 

p(loglogp)‘ 3 
2d“^l™losp 

Ai/(t-i)<p<Bi/3 


< 

2—m{t — 2)^2/3-\-o(l) 

and 



E '^t-2,ap{Blp^) 

b^/3<p<Vb 

< 

^ B{\oglogBY~^ 

p22(t—2)m lop-^ 

B1/3<p<VB 


< 

1 H(loglogH)*-3 

B^/^\ogB 21*“^)™ logi? 


< 

2 ” rri{t—2)^2/3+o(l) 


This covers work done at Step 7 of Algorithm 3. 
Finally, the cost of sieving is at most 

i?logi? 


E E 

Ai/(t-i)<p<VB 


rpX. 


rp 


If we use Theorem |3.7[ based on our conjecture, and using the lower bound X < 
k = rp, this leads to the bound 

t-2 ^l + l/(m+l)+o(l) 


(3.2) 


/log log g 

V 2™ 


A 


Without the conjecture, we use Lemma [372] together with the argument outlined in 
Hardy and Wright [5] in deriving (22.18.8), we obtain that 

t-2 


1 f log log B 

r ^ \ 2™ 


We then use 


A 


1 

< — . 


rp 


As above, this leads to the bound 

/ log log B 


(3.3) 




^l+l/(m+l) 


To balance the cost of GCDs with sieving, we want to balance (|3.1[) with (|3.2[) or 


(3.3), depending on whether we wish to assume our conjecture or not. Simplifying 

a bit, this means balancing A2 +°(i) with either (H/A)i+°(i) or 

The opti mal cutoff point is then either X = B^^^ under the assumption of Conjec¬ 


ture 


3.6 or A = B^* i)/(2t 1 ) unconditionally, for a total time of 

t-2 

q2/3+o{1) 


f log log B 
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with our conjecture, or 


/ logiogg y 

V 2™ y 


without. We have proven the following. 

Theorem 3.8. Assuming Conjecture 3.6. our algorithm takes time to 

find all integers n < B that are strong pseudoprimes to the first m prime bases, if 
m —> oo with B. 


If we choose m so that q is a fractional power of B, then Lemma 3.3 implies 
that there is a constant c > 0 such that if t > clog log i3, then there is no work 
to do. This explains why our computation did not need to go past t = 6. It also 
explains why, in practice, the t = 2 case is the bottleneck of the computation, and 
is consistent with the implications of our conjecture that the work for each value 
of t decreases as t increases. 


4. Implementation Details 

In this final section, we conclude with some details from our algorithm imple¬ 
mentation. 

4.1. Strong Pseudoprimes Found. In addition to our main results for ■012 and 
013, we discovered the following strong pseudoprimes. This first table contains 
pseudoprimes found while searching for 0 i 2 


n 

Factorization 

3825123056546413051 

747451- 149491 -34233211 

230245660726188031 

787711- 214831-1360591 

360681321802296925566181 

424665351661-849330703321 

164280218643672633986221 

286600958341-573201916681 

318665857834031151167461 

399165290221-798330580441 

7395010240794120709381 

60807114061-121614228121 

164280218643672633986221 

286600958341-573201916681 


This second table contains pseudoprimes found while verifying 0i3. 


n 

Factorization 

318665857834031151167461 

399165290221-798330580441 

2995741773170734841812261 

1223875355821-2447750711641 

667636712015520329618581 

577770158461-1155540316921 

3317044064679887385961981 

1287836182261-2575672364521 

3317044064679887385961981 

1247050339261-2494100678521 

552727880697763694556181 

525703281661-1051406563321 

360681321802296925566181 

424665351661-849330703321 

7395010240794120709381 

60807114061-121614228121 

3404730287403079539471001 

1304747157001-2609494314001 

164280218643672633986221 

286600958341-573201916681 


4.2. Hash Table Datastructure. It is natural to wonder how much time is 
needed to manage the hash table datastructure from §2.4| Specifically, we mea¬ 
sured the time to create the table when inserting all primes up to a set bound. 
For this measurement, we used a vector v with the first 8 prime bases. We also 
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Figure 1. GCD timings 


measured how long it took, for each prime up to the bound listed, to also perform 
the fetch operation, which returns a list of primes with matching signature with A 
values. 


Largest Prime 

Table Entries 

Greation Time 

Fetching Time 

10*^ 

78490 

1.13 

2.58 

10' 

664571 

11.49 

2:44.49 

10® 

5761447 

1:56.22 

3:30:50.2 


The times are given in seconds, or minutes:seconds, or hours:minutes:seconds as 
appropriate. 

We felt the data structure was a success, in that it was fast and small enough 
that it was not noticed in the larger computation. 

4.3. Almost-Linear GCDs. In Figure we present the times to perform Algo¬ 
rithm 1 on selected prime values for k (so this is for the t = 2 case). The data 
points were obtained by averaging the time needed to carry out Algorithm 1 for the 
first ten primes exceeding n ■ 10^ for 1 < n < 35. Averaging was required because 
the size of the h values is not uniformly increasing in k. This explains the variance 
in the data; for example, the first ten primes after 35 • 10^ happened to have smaller 
h values on average than the first ten primes after 34 • 10^. 

It should be clear from the data that our GMP algorithm for computing GGDs 
was using an essentially linear-time algorithm. Note that if we chose to extend the 
computation to large enough k values, memory would become a concern and we 
would expect to see paging/thrashing degrade performance. 

4.4. GCD/Sieving Crossover. In Figure]^ we present a comparison of the tim¬ 
ings for computing GGDs (Algorithm 1, diamonds on the graph) with signature 
sieving (Algorithm 2, squares on the graph). For this graph, we are looking at 
choosing the crossover point X, and again, we are focusing on the t = 2 case. 

• One would expect that the signature sieving curve should behave as an 
inverse quadratic; the time to sieve for one prime k = p is proportional to 

BlogB 

pXp 
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Figure 2. Signature Sieving and GCD timing 

and we expect Ap « p — 1 most of the time. However, utilizing signatures 
obscures this to some degree, since the algorithm cannot make use of all 
prime bases if they divide A, hence the minor variations in the curve. Let 
us elaborate this here. 

The data points for signature sieving in Figure represent the average 
sieve time for the first 50 primes past n ■ 10^ for 10 < n < 60. There is 
a lot of variance in these timing because of the inclusion and exclusion of 
primes from the signature depending if they are relatively prime to A. For 
example, p = 174594421 has A = 2^ • 3^ • 5 • 11 • 13 • 17 • 19 and therefore has 
few primes from the base set to use in signature sieving. The inability to 
add extra primes shows up in the timing data; a careful inspection of the 
data shows a strange jump when primes transition from 28 • 10^ to 29 • 10^. 
The data points to a steady decrease, then an almost two fold increase in 
time followed by a steady decrease again. We believe this is because, on 
average, one less base prime may be used in signature sieving. Using one 
less base prime results in approximately twice as much work. 

• In the computation for '4>i2, our actual crossover point X was 30 • 10^, even 
though the timing data would suggest the optimal crossover point is around 
12 • 10^. From a cost-efficiency point of view, we chose poorly. However, 
the choice was made with two considerations. One is that the program 
to compute GCDs (Algorithm 1) is simple to write, so that program was 
up and running quickly, and we let it run while we wrote the sieving code 
(Algorithm 2). Second is that, initially, we did not know which ipm value 
we would ultimately be able to compute. Since the results from GGD 
computations apply to all larger values of m, we opted to err in favor of 
computing more GGDs. 

4.5. Signature Sieving. Figurej^shows the impact that signature sieving makes. 

Here the squares in the graph give A-sieving times with no signature information 
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Figure 3. Signature Sieving and A-Sieving 

used, and the diamonds show the same A-sieving work done while taking advantage 
of signature information using the space-saving wheel. Since there is relatively little 
variance involved in A-sieving, each point represents the first prime after n ■ 10^ for 
10 < n < 60. On the same interval signature sieving practically looks like it takes 
constant time compared to A-sieving. If we had opted to not incorporate signature 
sieving in, then the expected crossover point given these timings would occur when 
the primes are around 42 • 10^. 
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